Context-Free Grammars are just one kind of grammar. In this section we briefly consider an even more restricted kind of grammar, known as a Regular Grammar.
As one would expect, the language of a Regular Grammar is a Regular Language. In fact, any regular language has a regular grammar, so we now have four ways to show a language is regular: find an NFA, a DFA, a regular expression, or a regular grammar.
Definition
A Regular Grammar is a context-free grammar in which all rules have one of the following forms:

| Production Shape | (In English) |
| ---------------- | ------------ |
| \(X\to aY\) | Nonterminal goes to terminal + nonterminal |
| \(X\to Y\) | Nonterminal goes to nonterminal |
| \(X\to a\) | Nonterminal goes to terminal |
| \(X\to \epsilon\) | Nonterminal goes to empty string |
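These four shapes are easy to check mechanically. As an illustrative sketch (the rule encoding below, pairs of `(head, body)` strings with uppercase letters as nonterminals, is an assumption made for this example, not standard notation):

```python
def is_regular_body(body):
    """True if a rule body fits one of the four shapes above (epsilon is "")."""
    if len(body) <= 1:
        return True                      # X -> epsilon, X -> a, or X -> Y
    return (len(body) == 2
            and not body[0].isupper()    # terminal first...
            and body[1].isupper())       # ...followed by a nonterminal: X -> aY

def is_regular_grammar(rules):
    """Check every (head, body) production of a toy grammar encoding."""
    return all(is_regular_body(body) for _head, body in rules)
```

For instance, `is_regular_grammar([("S", "0S"), ("S", "1A"), ("A", "1A"), ("A", "")])` holds, while a grammar containing the rule `("S", "SS")` fails the check.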
Example
The grammar
\[
\begin{array}{lcl}
S & \to & 0S\ |\ 1A\\
A & \to & 1A\ |\ \epsilon\\
\end{array}
\]
is a regular grammar (as well as being a context-free grammar).
Suppose we have a regular language \(L\). There must be an NFA for \(L\); it is fairly straightforward to turn an NFA recognizing \(L\) into a regular grammar that can produce any string in \(L\).
The trick is to have one nonterminal for each state in the NFA. Each nonterminal will produce all the strings in the language of the corresponding NFA state (i.e., the strings that would be accepted if we started at that NFA state instead of at the start state.) The definitions of these nonterminals can then be read off from the transitions of the NFA.
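This construction can be sketched directly in code. The sketch below assumes a toy NFA encoding in which transitions map (state, symbol) pairs to sets of states; the encoding and function name are illustrative only:

```python
def nfa_to_grammar(transitions, accepting):
    """One nonterminal per NFA state: each transition q --a--> r becomes a
    production q -> a r, and each accepting state q gets q -> epsilon."""
    rules = []
    for (q, sym), targets in transitions.items():
        for r in targets:
            rules.append((q, sym + r))
    for q in accepting:
        rules.append((q, ""))            # accepting state: q -> epsilon
    return rules

# An NFA for the earlier example grammar S -> 0S | 1A, A -> 1A | epsilon:
transitions = {("S", "0"): {"S"}, ("S", "1"): {"A"}, ("A", "1"): {"A"}}
rules = nfa_to_grammar(transitions, {"A"})
```

Reading the productions off this machine reproduces exactly the rules of the earlier example grammar.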
Example
Consider the following state machine, with start state \(A\), state \(X\), and accepting state \(B\):
If we look at state \(X\), we see that the strings that would be accepted starting at state \(X\) either:

- start with an \(a\) and have a remainder that can be accepted starting from state \(B\), or
- start with a \(b\) and have a remainder that can be accepted starting from state \(X\).

Similarly, the strings we can accept starting from state \(B\) either have an \(a\) or \(b\) followed by a string that can be accepted starting from state \(X\), or else are the empty string (which is accepted since \(B\) is an accepting state).

In this fashion, we can construct a (regular!) grammar for the languages of all the states:

\[
\begin{array}{lcl}
A & \to & aX\ |\ bA\\
X & \to & aB\ |\ bX\\
B & \to & aX\ |\ bX\ |\ \epsilon\\
\end{array}
\]

Since the language of a state machine is (by definition) the language of its start state, this grammar (where \(A\) is the start symbol) produces exactly the strings in the language of this state machine.
One can also go the other direction; any regular grammar can be directly translated into an NFA with one state per nonterminal:
Example
The regular grammar
\[
\begin{array}{lcl}
S & \to & 0S\ |\ 1A\\
A & \to & 1A\ |\ \epsilon\\
\end{array}
\]
can be translated into the following NFA:
Note that the state corresponding to the start symbol \(S\) is the start state, and the state(s) that can produce the empty string are accepting states.
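The reverse translation can be sketched the same way. The fragment below (names illustrative) handles only the two rule shapes that appear in this example, \(X\to aY\) and \(X\to\epsilon\); the other two shapes would need an extra accept state or an \(\epsilon\)-move and are omitted for brevity:

```python
def grammar_to_nfa(rules):
    """Translate X -> aY rules into NFA transitions and X -> epsilon rules
    into accepting states.  Other rule shapes are not handled in this sketch."""
    transitions, accepting = {}, set()
    for head, body in rules:
        if body == "":
            accepting.add(head)                       # X -> epsilon: X accepts
        elif len(body) == 2:
            transitions.setdefault((head, body[0]), set()).add(body[1])
        else:
            raise ValueError("rule shape not handled in this sketch")
    return transitions, accepting

def accepts(transitions, accepting, start, s):
    """Standard NFA simulation: track the set of currently reachable states."""
    current = {start}
    for sym in s:
        current = set().union(*(transitions.get((q, sym), set()) for q in current))
    return bool(current & accepting)

trans, acc = grammar_to_nfa([("S", "0S"), ("S", "1A"), ("A", "1A"), ("A", "")])
```

Simulating the resulting machine from state `"S"` accepts strings like `"1"` and `"0011"` and rejects strings like `"00"`, matching the grammar's language.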
Every regular grammar is also a context-free grammar, so every regular language is also a context-free language.
Similarly, a grammar can be context-free but not regular, and a language can be context-free but not regular.
What is slightly less obvious is that a non-regular grammar can describe a regular language!
Example
The grammar
\[
\begin{array}{lcl}
S & \to & SS\ |\ 1\\
\end{array}
\]
is not a regular grammar (the rule \(S\to SS\) is not one of the allowable forms for a regular grammar), but the language of this grammar is \(\{\ 1^n\ |\ n\geq 1\ \}\), which is easily seen to be regular.

How is this possible? Well, a language is regular iff it can be described by at least one regular grammar, e.g.,
\[
\begin{array}{lcl}
S & \to & 1S\ |\ 1\\
\end{array}
\]
The definition of regular language doesn't say anything about whether there might be non-regular ("unnecessarily complex") grammars for the language too.
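One way to convince yourself that the two grammars agree is to enumerate, by brute force, every string each one derives up to some length. The helper below is a rough sketch; it prunes sentential forms longer than the target length, which is valid here because every nonterminal in these grammars derives at least one terminal:

```python
from collections import deque

def language_up_to(rules, start, max_len):
    """All terminal strings of length <= max_len derivable from `start`,
    found by breadth-first rewriting of sentential forms (fine for tiny
    grammars whose nonterminals each derive at least one terminal)."""
    seen, results = {start}, set()
    queue = deque([start])
    while queue:
        form = queue.popleft()
        if len(form) > max_len:          # can only grow: prune
            continue
        idx = next((i for i, c in enumerate(form) if c.isupper()), None)
        if idx is None:                  # no nonterminals left: a real string
            results.add(form)
            continue
        for head, body in rules:         # expand the leftmost nonterminal
            if head == form[idx]:
                new = form[:idx] + body + form[idx + 1:]
                if new not in seen:
                    seen.add(new)
                    queue.append(new)
    return results

non_regular = [("S", "SS"), ("S", "1")]
regular     = [("S", "1S"), ("S", "1")]
```

Both calls `language_up_to(non_regular, "S", 5)` and `language_up_to(regular, "S", 5)` yield the same set \(\{1, 11, \ldots, 11111\}\), as expected.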